ABSTRACT

Context. The conditions of the upper atmosphere can change rapidly in response to solar and
geomagnetic activity. Among several heliophysical and geophysical quantities, the accurate
evolution of the solar irradiance is fundamental to forecasting the evolution of the neutral and
ionized components of the Earth's atmosphere.

Aims. We developed an artificial neural network model to compute the evolution of the solar
irradiance in near-real time. The model is based on the assumption that a great part of the solar
irradiance variability is due to the evolution of the structure of the solar magnetic field.

Methods. We employ a Layer-Recurrent Network (LRN) to model the complex relationships
between the evolution of the bipolar magnetic structures (input) and the solar irradiance (output).
The evolution of the bipolar magnetic structures is obtained from near-real time solar disk
magnetograms and intensity images. The magnetic structures are identified and classified according
to the area of the solar disk covered. We constrained the model by comparing the output of the model
with observations of the solar irradiance made by instruments on board the SORCE spacecraft. Here
we focus on two regions of the spectrum that are covered by SORCE instruments. While the range
from 115 nm to 310 nm is covered by the two SOLSTICE instruments, with a resolution of 1 nm,
XPS measurements and Lyman-alpha observations are combined to produce the spectra
from 0.1 to 34 nm.

Results. The generalization of the network is tested by dividing the data set into two groups: the
training set and the validation set. We have found that the model error is wavelength dependent.
While the model error for the 24-hour forecast in the band from 115 to 180 nm is lower than 5%,
the model error can reach 20% in the band from 180 to 310 nm. The performance of the network
degrades progressively as the forecast period increases, which significantly limits the maximum
forecast period that we can achieve with the discussed architecture.

Conclusions. The proposed model allows us to predict the total and spectral solar irradiance up
to three days in advance. The near real-time forecast of the total and spectral solar irradiance
is available at http://www.lpc2e.cnrs-orleans.fr/~soteria.

Key words. Sun: activity; Sun: faculae, plages; Sun: surface magnetism; Sun: solar-terrestrial 
relations; Sun: sunspots; Sun: UV radiation 
1. Introduction

The spectrally resolved solar irradiance (or SSI, for Solar Spectral Irradiance) is one of the key
parameters for solar-terrestrial science, and in particular for space weather and space climate. The
SSI in the visible and near-infrared bands is mostly absorbed in the troposphere and at the ground,
where it heats directly. The UV band is predominantly absorbed in the stratosphere and above,
where photolysis leads to a more complex chain of mechanisms (Haigh 2007; Gray et al. 2010).
In particular, the Extreme-UV (EUV, 10-121 nm) is the main ionisation source of the ionosphere;
any changes in the UV and EUV thus directly impact the Earth's middle and upper atmosphere.
Some of their societal consequences are: increased satellite drag due to heating of the thermosphere,
perturbation of ground-satellite communications due to changes in the ionospheric electron density,
and, on the longer term, impact on climate change. For that reason, the continuous and long-term
monitoring of the SSI has become one of the most important issues toward the quantification of the
impact of solar variability on the Earth's environment (Lean 2005).



Unfortunately, the continuous monitoring of the SSI is a major challenge as it has to be done
outside of the terrestrial atmosphere, where instruments suffer from degradation and the almost
total lack of in-flight calibration. Continuous observations of the SSI really started only with the
SORCE mission in 2003, as this mission, together with TIMED, was the first to provide
complete and continuous coverage of the solar spectrum, from the soft X-ray (XUV, 1-10 nm) to
the near infrared. These observations have led to a series of unexpected discoveries (Harder et al.
2009). Unfortunately, SORCE is now approaching the end of its mission life.



To compensate for this severe lack of direct observations, various empirical and semi-empirical
SSI models have been developed, especially in the EUV range, which is crucial for space weather
(Lilensten et al. 2008). All these models rely on various proxies for solar activity (Krivova and
Solanki 2008). Various empirical models have successfully used solar indices such as Mg II and
F10.7 to reconstruct the SSI (Lean 2000; Tobiska and Bouwer 2006; Lean et al. 2011). A
different class of semi-empirical models relies on solar continuum images and solar magnetograms
(Fligge et al. 1998; Krivova, Vieira, and Solanki 2010; Ball et al. 2011; Fontenla et al. 2011).
The validity of these models has recently been challenged by the (presumably) anomalous spectral
variability observed by SORCE in the near-UV (Haigh et al. 2010). Most of these models are
based on the premise that the variability in the SSI is driven by surface magnetism only; this
hypothesis has been remarkably well confirmed so far, but its validity for long-term (i.e. centennial
and beyond) variations is still an open issue (Fröhlich 2011; Shapiro et al. 2011; Vieira et al. 2011).
Note that none of them can properly reproduce short transients such as flares, for which dedicated
models exist (Chamberlin, Woods, and Eparvier 2008). In what follows, we shall also exclude such
transients and concentrate on time scales exceeding one hour.

While SSI models are crucial for understanding the physical mechanisms that are responsible
for the variation of the solar radiative output (Domingo et al. 2009), they are also increasingly
required for more applied purposes, namely for space weather applications. Unfortunately,
operational requirements bring in a number of constraints that cannot always be met by research-grade
models. These include latency (the SSI must be available in near real-time) and continuity (backup
options are required if there are service outages). The capacity of operational models to provide
forecasts also becomes an important issue. As of today, only the commercial Solar2000 model
(Tobiska and Bouwer 2006) is used for operational purposes. Solar2000 uses various solar proxies
to deliver SSI reconstructions covering the EUV to the near-infrared.

In the framework of the European collaborative project SOTERIA¹ we developed a
non-commercial SSI model that meets the constraints of operational use while keeping the high
physical performance of semi-empirical models that rely on solar surface magnetism. Our model, which
will be detailed below, uses continuum images and magnetograms from the Helioseismic and Magnetic
Imager (HMI) instrument (Schou et al. 2011) on board the Solar Dynamics Observatory (SDO),
which are segmented into various types of solar features. The filling factors associated with these
features, and their centre-to-limb position, are fed into an artificial neural network that has been
trained using SSI observations from SORCE instruments. The present version of the model
focuses on the nowcast and short-term (1-3 days) forecast of the XUV to UV ranges only (also including
the TSI, the Total Solar Irradiance), whereas an upcoming version will extend the range to the visible
and use a flux transport model to provide one-month-ahead forecasts.

The paper is structured as follows. In Sect. 2 we describe the data sets employed as the model's
input and target. The neural network model, the feature extraction procedure, the preprocessing,
the training, and the validation are presented in Sect. 3. In Sect. 4 we present the results. The
conclusions are given in Sect. 5.



2. Data 

We employ solar disk magnetograms and continuum images obtained by the HMI/SDO instruments,
which are available at the HMI - AIA Joint Science Operations Center - Science Data Processing
website², to track the evolution of bipolar magnetic regions. We use the quicklook version of
the images (1024x1024 pixels), which is available with a latency of 15 minutes. Unfortunately, the
calibrated images are delivered days to weeks later, and so cannot be used. The images are provided
in the jpg (joint photographic experts group) format in intensity levels instead of physical units. We
point out that by using this format we may overestimate the area of the solar disk covered by bipolar
magnetic regions because of the non-linear relationship between the magnetic field intensity of a
given pixel and the actual area of that pixel covered by magnetic structures. However, the artificial
neural network model is able to deal with these non-linearities.

We have selected images with 1024x1024 pixels because of the computational limitations of
the image processing and feature extraction. Although we have employed only HMI/SDO images
in this analysis, observations from other instruments can be used to compute the fraction of the
solar disk covered by magnetic structures. We stress that the identification of the filling factors
improves significantly if images with 4096x4096 pixels are employed.

We employ observations of the total and spectral solar irradiance from instruments on board
the Solar Radiation and Climate Experiment (SORCE) spacecraft (e.g. Rottman 2005). The data are
available at the SORCE website. We use measurements from the TIM instrument to monitor changes
of the total solar irradiance (TSI) with a time cadence of 6 hours. We employ measurements from the
SIM, SOLSTICE, and XPS instruments to monitor the evolution of the irradiance in different bands
of the solar spectrum, covering the XUV (0.1-10 nm), part of the EUV (up to 40 nm), the FUV
(121-180 nm) and the MUV (180-300 nm). The gap in the EUV will soon be filled by using data from
TIMED/SEE. The time cadence of the data is 1 day.

¹ Solar Terrestrial Interactions and Archives (2008-2011)
² http://jsoc.stanford.edu/



3. Approach 

We employ an artificial neural network model to compute in near-real time the evolution of the
solar irradiance from the distribution of magnetic active regions on the solar disk. This approach
is based on the assumption that the solar irradiance variability is predominantly, if not entirely,
caused by the evolution of the structure of the solar magnetic field (Krivova et al. 2003). As the
3-dimensional structure of the magnetic field is imprinted on the solar surface, solar disk
magnetograms and continuum images are employed to track its evolution. The emission at a given
wavelength is computed from the distribution of dark and bright features on the solar disk. It is
sufficient to segment the solar disk into three groups to reproduce with high precision the evolution
of the total and spectral solar irradiance. The groups needed for capturing the salient features of the
solar spectral variability are: the quiet Sun, sunspots (umbra and penumbra), and bright magnetic
structures (faculae and the network).

Models such as the SATIRE (Spectral And Total Irradiance Reconstruction) models are also
based on the same assumption and have been successfully employed to compute the evolution of
the solar irradiance from days to millennia (Krivova et al. 2003). However, while SATIRE models
employ the intensities of each atmospheric component to compute the emission at a given
wavelength, we use a more empirical and data-driven approach based on an artificial neural network
(ANN). The main advantage of using an ANN instead of the intensities of each atmospheric
component is the flexibility to recognize and predict temporal patterns from near-real time solar disk
magnetograms and intensity images. The model, however, needs to be trained with real data, so our
assumption here is that the spectra from SORCE are indeed the true representation of the SSI.



3.1. Description of the Artificial Neural Network model and data

We employ a Layer-Recurrent Network (LRN) to model the complex relationships between the
evolution of the bipolar magnetic structures (input) and the solar irradiance (output). We follow
the original structure proposed by Elman (1990). Figure 1 presents a schematic representation
of the LRN architecture. The network has two layers with one neuron in each layer. While the
network uses a nonlinear transfer function (hyperbolic tangent, tanh) for the first layer, a linear
transfer function is employed for the output layer. Additionally, there is a feedback loop, with a
single delay, around the first layer. This feedback introduces memory in the system and helps
stabilize the spectra when the inputs suffer from outliers.

The output of the first layer ($a_{1,t}$) at the discrete time instant $t$ is computed taking into account
the weighted input ($Wp$), the bias ($b_1$), and the weighted feedback ($L a_{1,t-1}$). The input of the
network is the $n$-element vector $p$. The elements of the input vector, $p_j$, are multiplied by the
weights $w_j$. The weighted values are then summed. In this way, we can express mathematically the
output of the first layer as

$$a_{1,t} = \tanh(W p + L a_{1,t-1} + b_1) \qquad (1)$$

Fig. 1. Schematic representation of the artificial neural network architecture. See text for a detailed
explanation of the network structure.



The output of the second layer ($a_{2,t}$), which is the irradiance at a given wavelength and at a given
discrete time ($t$), is represented by a linear combination

$$a_{2,t} = M a_{1,t} + b_2 \qquad (2)$$
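The two layer equations can be illustrated with a minimal Python sketch of the forward pass, assuming one neuron per layer as stated in the text. The function name `lrn_forward`, the argument names, and the initial feedback state `a1_init` are illustrative assumptions and are not taken from the original implementation.

```python
import math

def lrn_forward(inputs, W, L, b1, M, b2, a1_init=0.0):
    """Forward pass of a one-neuron-per-layer recurrent network.

    inputs : list of input vectors p_t (each a list of floats)
    W      : list of input weights w_j, same length as each p_t
    L      : feedback weight applied to the previous first-layer output
    b1, b2 : biases of the first and second layer
    M      : weight of the linear output layer
    a1_init: assumed initial value of the feedback state (not from the paper)
    """
    a1_prev = a1_init
    outputs = []
    for p in inputs:
        weighted_input = sum(w * x for w, x in zip(W, p))   # W p
        a1 = math.tanh(weighted_input + L * a1_prev + b1)   # first layer, Eq. (1)
        a2 = M * a1 + b2                                    # output layer, Eq. (2)
        outputs.append(a2)
        a1_prev = a1                                        # single-delay feedback
    return outputs
```

With `L = 0` the sketch reduces to a static nonlinear regression on the current input vector, which is the regime in which the feedback contribution is negligible.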



In principle, solar disk magnetograms and intensity images can be employed directly as the
input of the network. However, this would imply a very large number of coefficients to be
determined, which is not computationally efficient. One alternative is the transformation of the input
images into a set of features. This process, which is known as feature extraction, is a form of
dimensionality reduction. In our case, the simplest way to proceed is the determination of the fraction of
the disk covered by magnetic structures, the filling factors. The algorithm employed to determine
the filling factors is described in detail in the next section.

The coefficients $W$, $L$, $M$, $b_1$, and $b_2$ are determined by minimizing a combination of squared
errors and weights (MacKay 1992). This process is also known as Bayesian regularization. In
this way, we determine the set of coefficients to produce a network that generalizes properly the
relations between the input and the output data.
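As a hedged illustration of what "a combination of squared errors and weights" means, the sketch below evaluates the usual regularized objective, a weighted sum of the squared errors and the squared coefficients. The names `alpha` and `beta` are assumptions; in the Bayesian framework these hyperparameters are re-estimated from the data during training, whereas here they are simply passed in as fixed numbers.

```python
def regularized_objective(targets, outputs, weights, alpha, beta):
    """Return beta * E_D + alpha * E_W.

    E_D is the sum of squared errors between targets and network outputs;
    E_W is the sum of squared network coefficients (weights and biases).
    alpha and beta are fixed here for illustration only.
    """
    e_d = sum((t - y) ** 2 for t, y in zip(targets, outputs))
    e_w = sum(w ** 2 for w in weights)
    return beta * e_d + alpha * e_w
```

A larger `alpha` relative to `beta` penalizes large coefficients more strongly, which favours smoother responses and better generalization at the cost of a larger fitting error.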

The algorithm described above can be applied for nowcast as well as for short-term forecast up
to two days. Here, we define nowcast as a short-term forecast out to six hours.



3.2. Feature extraction procedure 

Here we employ solar disk magnetograms and continuum images to identify the magnetic active
regions and sunspots, respectively. An example of such observations is provided in Figures 2a-b,
which show the observations of the solar disk obtained by HMI instruments on 04-Aug-2011 at
9:00 UT. Large magnetic active regions and sunspots are present in the northern hemisphere. The
feature extraction procedure consists of the identification of the quiet Sun, magnetic active regions,
and sunspots.

The method to identify magnetic active regions in the disk magnetograms includes the
following steps: (a) identification of the disk pixels; (b) segmentation of the image;
(c) connected-component labelling; (d) computation of the area of each object identified in the binary image; (e)
removal of small area objects.

The identification of the disk pixels is made taking into account that most of the pixels outside
of the solar disk correspond to black pixels ($I_m[i,j] = 0$). In this way, we segment the image
employing a simple threshold. Note that by applying this threshold the annotation pixels outside
of the disk will also be identified as disk pixels. In order to deal with this problem, we search for
the feature with the largest area.

To distinguish the magnetic active regions from the quiet Sun, we segment the disk
magnetograms ($X_{m,p}$) by applying a threshold

$$X_{m,b}[i,j] = \begin{cases} 0 & \text{if } |X_{m,p}[i,j] - X_{m,0}| < X_{m,th}, \\ 1 & \text{if } |X_{m,p}[i,j] - X_{m,0}| > X_{m,th}, \end{cases} \qquad (3)$$

where $X_{m,b}$ is the resulting binary image, $X_{m,0}$ is the intensity level that corresponds to 0 Gauss,
and $X_{m,th}$ is the threshold. One example of the binary image resulting from the segmentation is
shown in Figure 2c. White regions represent magnetic active regions; the quiet Sun is shown in black.
Next, we perform a connected-component labeling, which is a method for identifying each object in a
binary image. The connectivity is four (4), which means that we search the 4-connected neighborhood
of each pixel. The area of each object, which is employed to classify the objects, is then computed.
We remove objects with small areas (less than 10 pixels).
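The segmentation, labelling, area computation, and small-object removal steps can be sketched in plain Python as follows. This is a minimal illustration under stated assumptions, not the code used for the paper: images are nested lists of intensity levels, the function names are hypothetical, and pixels exactly at the threshold are assigned to the quiet Sun.

```python
from collections import deque

def segment(image, zero_level, threshold):
    """Binary mask: 1 where |X - zero_level| > threshold, else 0."""
    return [[1 if abs(v - zero_level) > threshold else 0 for v in row]
            for row in image]

def label_components(mask, min_area=10):
    """4-connected component labelling with removal of small objects.

    Returns (labels, areas): labels holds an object id per pixel
    (0 = background or removed object); areas maps id -> pixel count.
    """
    rows, cols = len(mask), len(mask[0])
    work = [row[:] for row in mask]            # pixels still to be visited
    labels = [[0] * cols for _ in range(rows)]
    areas = {}
    next_id = 1
    for i in range(rows):
        for j in range(cols):
            if work[i][j] != 1:
                continue
            # Breadth-first flood fill over the 4-connected neighbourhood.
            queue = deque([(i, j)])
            work[i][j] = 0
            pixels = [(i, j)]
            while queue:
                r, c = queue.popleft()
                for dr, dc in ((1, 0), (-1, 0), (0, 1), (0, -1)):
                    rr, cc = r + dr, c + dc
                    if 0 <= rr < rows and 0 <= cc < cols and work[rr][cc] == 1:
                        work[rr][cc] = 0
                        queue.append((rr, cc))
                        pixels.append((rr, cc))
            if len(pixels) >= min_area:        # keep only large enough objects
                for r, c in pixels:
                    labels[r][c] = next_id
                areas[next_id] = len(pixels)
                next_id += 1
    return labels, areas
```

The same labelling routine can serve for the disk identification step: after thresholding the non-black pixels, the object with the largest entry in `areas` is taken as the solar disk, which discards the annotation pixels.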

We identify sunspots (umbrae and penumbrae) in the solar disk intensity images. The procedure 
is similar to the one used to identify magnetic active regions. However, as sunspots are dark features 
in the solar disk, we search for pixels that are below a given threshold. In order to distinguish 
umbrae and penumbrae, we apply two thresholds. 



$$X_{c,b}[i,j] = \begin{cases} 0 & \text{if } (X_{c,p}[i,j] - X_{c,0}) > X_{c,th_p}, \\ 1 & \text{if } (X_{c,p}[i,j] - X_{c,0}) < X_{c,th_p} \text{ and } (X_{c,p}[i,j] - X_{c,0}) > X_{c,th_u}, \\ 2 & \text{if } (X_{c,p}[i,j] - X_{c,0}) < X_{c,th_u}, \end{cases} \qquad (4)$$

where $X_{c,0}$ is the reference level and $X_{c,th_u}$ and $X_{c,th_p}$ are the thresholds for umbrae and penumbrae,
respectively. An example of a binary image obtained is presented in Figure 2d. The sunspots identified are
presented in white.
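The two-threshold classification of the continuum image can be sketched as below. The function and argument names are illustrative assumptions; the thresholds are expressed as offsets from the reference level, so both are expected to be negative with the umbra threshold below the penumbra threshold. Pixels exactly at a threshold, which the inequalities leave undefined, are assigned here to the brighter class.

```python
def classify_continuum(image, reference, th_penumbra, th_umbra):
    """Classify each pixel as 0 (quiet Sun), 1 (penumbra) or 2 (umbra).

    image       : nested list of continuum intensity levels
    reference   : reference intensity level
    th_penumbra : offset below which a pixel is at least penumbra
    th_umbra    : offset below which a pixel is umbra (th_umbra < th_penumbra)
    """
    def classify(value):
        diff = value - reference
        if diff < th_umbra:
            return 2          # darkest pixels: umbra
        if diff < th_penumbra:
            return 1          # intermediate pixels: penumbra
        return 0              # everything else: quiet Sun
    return [[classify(v) for v in row] for row in image]
```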

In Figure 2c we observe that bipolar regions occur in a continuous size spectrum (Harvey 1993).
However, it is convenient to consider separately the spectral contribution from active regions and
ephemeral regions. Here we divide the size spectrum into four classes according to the filling factors of
individual structures, i.e. the fraction of the solar disc covered by an individual magnetic structure.
These classes are determined from the empirical cumulative distribution function (ECDF) of the
features identified from Sep/2010 to Dec/2010. Figure 3 presents the ECDF obtained (blue line).
The boundaries between the four classes are defined approximately at the probability levels 33.3%,
66.6%, and 97%. Table 1 shows the resulting classification according to the filling factors of the
structures. Following this classification scheme, we produce an image mask in which the active
regions and the sunspots are represented. Figure 4 displays an example of the image mask produced
from the magnetogram and intensity image of 04-Aug-2011 at 09:00:00.
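A minimal sketch of how class boundaries can be read off an ECDF and then used to classify a structure is given below. The function names and the convention that a structure exactly at a boundary falls in the lower class are assumptions made for illustration.

```python
import bisect
import math

def class_boundaries(filling_factors, levels=(0.333, 0.666, 0.97)):
    """Filling-factor values at which the ECDF first reaches each level.

    The ECDF at x is the fraction of samples less than or equal to x.
    """
    data = sorted(filling_factors)
    n = len(data)
    bounds = []
    for level in levels:
        k = max(1, math.ceil(level * n))   # smallest rank with ECDF >= level
        bounds.append(data[k - 1])
    return bounds

def classify_structure(filling_factor, bounds):
    """Return the class number 1..4 (I..IV) of one structure."""
    return bisect.bisect_left(bounds, filling_factor) + 1
```

For example, with the boundaries listed in Table 1, `classify_structure(50.0, [16.7, 24.8, 89.1])` places a structure of 50 ppm in Class III.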

Table 1. Classification scheme of the bipolar magnetic structures according to the filling factors.

Class   Filling factor α (ppm)
I       α < 16.7
II      16.7 < α < 24.8
III     24.8 < α < 89.1
IV      α > 89.1



The contribution of the solar features to the solar irradiance also depends on the position of
the features on the solar disk. In order to take this into account, we compute the area covered by
bipolar features in concentric rings. These rings are determined according to the heliographic angle
($\mu$). Figure 5 shows the eleven (11) rings considered in this work. As we will show later, there is
no need to increase that number.
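The ring bookkeeping can be illustrated with the sketch below. The text does not state how the ring boundaries are spaced, so the sketch assumes rings of equal width in the cosine of the heliocentric angle, and it assumes that the filling factor of a ring is the number of flagged pixels in that ring divided by the total number of disk pixels. Both choices, and all names, are hypothetical.

```python
import math

def ring_index(x, y, x0, y0, radius, n_rings=11):
    """Ring number (0 = disk centre) of pixel (x, y), or None if off-disk.

    Assumes rings of equal width in mu = sqrt(1 - (r/R)^2).
    """
    r2 = ((x - x0) ** 2 + (y - y0) ** 2) / float(radius) ** 2
    if r2 > 1.0:
        return None
    mu = math.sqrt(1.0 - r2)
    return min(n_rings - 1, int((1.0 - mu) * n_rings))

def ring_filling_factors(mask, x0, y0, radius, n_rings=11):
    """Fraction of the disk covered by flagged pixels, separately per ring."""
    counts = [0] * n_rings
    disk_pixels = 0
    for y, row in enumerate(mask):
        for x, flagged in enumerate(row):
            k = ring_index(x, y, x0, y0, radius, n_rings)
            if k is None:
                continue
            disk_pixels += 1
            if flagged:
                counts[k] += 1
    if disk_pixels == 0:
        return [0.0] * n_rings
    return [c / disk_pixels for c in counts]
```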

The input vector ($p$) of the network is defined as the filling factors of the 10 inner rings of each
class considered. Following a common practice, we normalize the input time series proportionally
to the standard deviation before they enter the neural network. Note that the thresholds employed
for the segmentation of the images are fixed taking into account that the noise should be removed.
Although the threshold values affect the distribution of the filling factors of individual structures, the
training procedure adjusts the ANN coefficients in order to generalize properly the output.
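The normalization step can be sketched as follows, assuming that "proportionally to the standard deviation" means dividing each input time series by its own standard deviation. Whether the mean is also removed is not stated in the text, so it is left untouched here; the function name is illustrative.

```python
import statistics

def normalize_series(series):
    """Scale one input time series by its population standard deviation.

    A constant series (zero standard deviation) is returned unchanged.
    """
    sigma = statistics.pstdev(series)
    if sigma == 0:
        return list(series)
    return [value / sigma for value in series]
```

Scaling every input to unit variance keeps the weights of the different classes and rings comparable in magnitude, which also makes the regularization term act evenly on all coefficients.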

4. Results and Discussions 

4.1. Evolution of the filling factors

Figure 6 shows the evolution of the various filling factors from Sep/2010 to Oct/2011. Each line
presents the fraction of the solar disk covered by structures that belong to one class, i.e. the filling
factors of each class. The yellow, red, and green lines display the evolution of ephemeral regions
(ER) that are members of Classes I, II, and III, respectively. The blue line exhibits the evolution
of active regions (AR), which are structures with filling factors larger than 89.1 ppm (Class IV).
The brown and black lines show the evolution of the sunspot penumbrae and umbrae, respectively.
Note that for a better visualization, the filling factors of penumbrae and umbrae were multiplied by
a factor of five (5).

We find that the filling factors of Classes I and II structures do not present trends during the
period considered. However, an increase of the filling factors of Class I structures in relation to
the average value is observed from 29-Oct-2010 to 07-Mar-2011. Changes of the filling factors
of Class II structures are observed near the boundaries of this interval. These changes appear as
decreases of the filling factors of Class II structures from 29-Oct-2010 to 10-Nov-2010 and from
15-Feb-2011 to 07-Mar-2011. We speculate that these variations are due to small changes in the
calibration of the magnetograms or the mapping of the quicklook images. In this interval, we did
not observe any such discontinuity in the filling factors of Classes III and IV structures. We also
find that oscillations with periods longer than a few days are not present in the time series of Classes
I and II, which suggests that they are quasi-uniformly distributed on the solar surface. Although
this result has to be confirmed with higher resolution and calibrated magnetograms and intensity
images, it implies that the evolution of small size structures (α < 24.8 ppm) does not contribute
significantly to medium-term changes of the solar surface magnetic flux (i.e. on time scales of
months). Consequently, long-term changes of the solar surface magnetic flux seem to be due only
to the evolution of structures with filling factors larger than 24.8 ppm. Furthermore, assuming that
changes of the solar irradiance are due only to the evolution of the magnetic field structure, the
long-term evolution of the solar irradiance may then be constrained only by the evolution of Class
III and IV structures.

Fig. 2. SDO/HMI solar disk line-of-sight magnetogram (a) and continuum intensity image (b)
measured on 04-Aug-2011 09:00:00. (c) Bipolar regions identified in the magnetogram. (d) Sunspots
identified in the continuum image.

Fig. 3. Empirical cumulative distribution function of the filling factors of bipolar magnetic
active regions identified in the magnetograms from 19-Sep-2010 to 01-Mar-2010. The red lines
indicate the filling factors at 0.33, 0.66, and 0.97. The four classes are marked in the figure.

Fig. 4. Distribution of bipolar magnetic regions identified on 04-Aug-2011 09:00:00. The color
scheme indicates the four classes of active regions considered in this work. The umbrae and
penumbrae regions are also indicated in the figure. The quiet Sun is shown in gray.

Fig. 5. Distribution of the concentric rings employed to take into account the center-to-limb
contrast of the bipolar regions.

A quasi-periodic 27-day modulation of the filling factors of active regions is observed from
the middle of Sep/2010 to the middle of Jan/2011. The maximum and minimum values remained at
the same level during this period. The heliographic longitude of the disk center during the peaks
of the filling factors of active regions is in the sector between 100 and 200 degrees (see Fig. 8).
After a short period of low activity, the area covered by active regions increased approximately four
(4) times from Jan/2011 to Nov/2011. The increase of the complexity of the signal after Jan/2011
reflects the emergence of active regions in the longitudinal sector from -50 to 50 degrees. In the
same way, the filling factors of the Class III structures increased by about 50% from Sep/2010
to Oct/2011. These patterns for the increase of the area covered by active and ephemeral regions
are in agreement with previous studies suggesting that ephemeral regions evolve in cycles that are
longer than sunspot cycles. Consequently, as the ephemeral regions evolve in cycles that overlap
significantly, their variation through the 11-year cycles is lower than the variation of the area covered
by active regions.

In addition to the presence of active regions on the solar disk, their spatial distribution directly
affects the variability of the solar irradiance. Figure 8 shows the evolution of the filling factors
of the four (4) inner rings. Panels (a) to (d) present the evolution of the filling factor of Classes I,
II, III, and IV structures, respectively. The evolution of the sunspot penumbrae and umbrae is
displayed in panels (e) and (f), respectively. While the changes of the area covered by large active
regions and sunspot groups are easily noticeable when considering individual rings (see Fig. 8d-f),
the filling factors of Classes I and II structures do not present any significant modulation. The
filling factors of Class III structures are also affected by the presence of active regions, but these
structures are observed in a wider region of the solar disk.

Figure 7 exhibits a comparison between the evolution of active regions and the solar
irradiance. Panels (a) and (b) present the filling factors for large active regions and sunspot umbrae,
respectively. Panel (c) displays the evolution of the total solar irradiance (blue line) and the Lyman-α
emission. The distribution of active regions on the solar disk prior to, during, and after the passage of
the sunspot groups is presented at the bottom of the figure. Note that during the transit of sunspots
on the solar disk the total solar irradiance decreases significantly, as expected. The transit of the
active regions from their appearance on 2011-07-26 to the maximum on 2011-08-01 is observed as a
progressive increase of the filling factors from the limb to the center rings. At the same time, the
TSI decreases progressively, reaching its minimum value around 2011-08-01 12:00, when three (3)
large active regions in the northern hemisphere are near the disk center. A gradual decrease of the
area covered by the active regions/sunspots and the concomitant increase of the TSI occur from
2011-08-01 to 2011-08-10. However, as the pattern of the Lyman-α emission illustrates, the
presence of active regions produces different patterns in different bands of the spectrum. In
contrast to the decrease of the TSI during the passage of the active regions, a clear enhancement of
the Lyman-α emission is observed from 2011-07-26 to 2011-08-10. A large enhancement of the
Lyman-α emission is also observed from 2011-07-11 to 2011-07-26 without a concomitant
decrease of the TSI. In this way, the feature extraction procedure is able to identify the presence and
the distribution of the active regions/sunspots that are needed to compute the evolution of the solar
irradiance employing a neural network.

4.2. Short-term forecast of solar irradiance 

As discussed in Sect. 3, the coefficients of the network are obtained by comparing the output of the
model with observations of the solar irradiance made by instruments on board the SORCE spacecraft.
Here we focus on two regions of the spectrum that are covered by SORCE instruments. While the
range from 115 nm to 310 nm is covered by the two SOLSTICE instruments, with a resolution of
1 nm, the XPS instrument measures spectra from 0.1 to 34 nm.

Figure 9 shows a comparison between the observations of the irradiance at 121.5 nm measured
by the SOLSTICE instrument and the output of the 24-hour forecast model. The red and blue
lines in the upper panel present the contribution of ephemeral regions (Classes II and III,
respectively). The green line displays the contribution of active regions (Class IV). Figure 9b displays the
contribution of umbrae (red line) and penumbrae (blue line). The contributions are computed by
multiplying each element of the input vector by the weights $w_j$. The weighted elements of each
class are then summed and the total contribution of each class is obtained. Figure 9c shows the
feedback contribution ($L a_{1,t-1}$). The larger this value is (in absolute terms), the more the ANN relies
on past values to estimate the present one. That is, a large feedback implies a reconstruction based
on persistence. In this particular example (Fig. 9c), the feedback is negligible. Figure 9d presents
the time series from SOLSTICE (blue line), the neural network output (red line), and the neural
network output with a linear transfer function for the first layer (green line). The training (80%) and
validation (20%) sets are indicated in the figure. The model reproduces adequately the variability of
the training set as well as the validation set, which indicates that the model properly generalizes the
relations between the distribution of bipolar magnetic features on the solar disk and the Lyman-α
emission. Note also that the output of the linear model, in which each neuron has a linear response,
performs in most cases as well as the nonlinear one, except for the larger excursions and outliers.
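The per-class contributions described above amount to a grouped weighted sum, which can be sketched as follows. The function name and the `class_of` labelling list, which records the class of each input element, are illustrative assumptions.

```python
def class_contributions(p, w, class_of):
    """Sum w_j * p_j over the input elements that belong to each class.

    p        : input vector (filling factors per ring and class)
    w        : input weights, one per element of p
    class_of : class label of each element of p (hypothetical bookkeeping)
    """
    totals = {}
    for pj, wj, cls in zip(p, w, class_of):
        totals[cls] = totals.get(cls, 0.0) + wj * pj
    return totals
```

The sum of all class contributions is the weighted input $Wp$ that enters the first layer, so the individual terms show which class drives the argument of the transfer function at each time step.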

The coefficients of the model are shown in Figure 10. The coefficients of each class considered
are indicated as well as the coefficient corresponding to the inner ring. In this example, most of the
contribution for the evolution of the irradiance is due to the evolution of the large active regions. As
expected, the major contribution to the variability of the Lyman-α emission is due to the evolution
of active regions (Class IV), although the feedback also contributes. Not surprisingly, this
contribution mainly comes from active regions that are near the center of the disk. This property is, of
course, wavelength dependent. Incidentally, because the model is data driven, we now can use it to
infer properties about the radial contribution of specific features for each wavelength. This opens
interesting perspectives that will be investigated in a forthcoming publication.

The percentage difference between the output of the 24-hour forecast model and the observations,
i.e. the model error, is presented in Fig. 11 for the training (blue line) and validation (green
line) sets from 115 to 310 nm. The model error is noticeably higher in the MUV region of the spectrum
than in the FUV. The model error for the XPS region of the spectrum, which is not shown
in the figure, is comparable to the error of the FUV region.
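
As a rough illustration of how such a model error could be evaluated, the sketch below splits a time series chronologically into training and validation parts and computes a percentage error. The exact statistic used in the paper is not stated in this section; the mean absolute percentage difference and the helper names used here are assumptions.

```python
import numpy as np

def chronological_split(series, train_fraction=0.8):
    """Split a time-ordered series into training and validation parts."""
    n_train = int(round(train_fraction * len(series)))
    return series[:n_train], series[n_train:]

def percent_error(model, observed):
    """Mean absolute percentage difference between model and observations
    (an assumed definition of the model error)."""
    model = np.asarray(model, dtype=float)
    observed = np.asarray(observed, dtype=float)
    return 100.0 * np.mean(np.abs(model - observed) / np.abs(observed))
```

Applying `percent_error` separately to each wavelength bin of the training and validation parts would give curves comparable in spirit to the two lines described for Fig. 11.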

Examples of training sections for the total solar irradiance for forecast periods from 12 hours
to 72 hours are displayed in Figures 12 to 15. The structure of these figures is the same as that of
Fig. 9. As expected, the model error increases as the forecast period, i.e. the time interval for which
a forecast is made, increases. Moreover, while the contribution of the feedback for predictions up
to 48 hours is negligible, the contribution of the feedback for the 72-hour forecast is significant. This
indicates that, as expected, 72-hour forecasts rely heavily on past observations.
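
The notion of a forecast period can be made concrete with a small sketch that pairs the inputs at time t with the irradiance observed one forecast period later. The function name and the idea of expressing the horizon as a number of sampling steps are assumptions for illustration; the sampling cadence of the actual data set is not specified here.

```python
import numpy as np

def make_forecast_pairs(features, target, horizon_steps):
    """Pair features at time t with the target at time t + horizon_steps.

    features: (n_steps, n_inputs) array of input quantities
    target:   (n_steps,) array of observed irradiance
    """
    features = np.asarray(features)
    target = np.asarray(target)
    if not 0 < horizon_steps < len(target):
        raise ValueError("horizon_steps must be between 1 and len(target) - 1")
    # Drop the last inputs (no future target) and the first targets (no past input).
    return features[:-horizon_steps], target[horizon_steps:]
```

For example, with a hypothetical 12-hour cadence, a 24-hour forecast would correspond to `horizon_steps=2` and a 72-hour forecast to `horizon_steps=6`.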

The increase of the model error as a function of the forecast period and the progressive
dependence on past observations are related to the emergence and decay of the magnetic active regions
as well as to their transit across the solar disk as seen from our vantage point. The evolution
of the magnetic active regions affects the structure of the solar atmosphere and its
electromagnetic emission, which are not predictable with the ANN architecture discussed in this paper.
Additionally, the solar surface does not rotate as a solid body, which is an implicit assumption
in this architecture. These sources of error can be reduced by employing a surface magnetic flux
transport model, as briefly described in the next subsection.
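
To give a sense of how quickly the solid-body assumption breaks down, the sketch below evaluates a differential rotation profile and the longitude lag it produces. The profile is not taken from this paper: the coefficients are those of a widely quoted empirical synodic fit (usually attributed to Snodgrass 1983) and are used here purely as an assumption.

```python
import numpy as np

def synodic_rotation_rate(latitude_deg):
    """Assumed differential rotation profile, in degrees per day (synodic)."""
    s2 = np.sin(np.radians(latitude_deg)) ** 2
    return 13.38 - 2.30 * s2 - 1.62 * s2 ** 2

def longitude_lag(latitude_deg, days):
    """Longitude (degrees) by which a feature at the given latitude falls
    behind an equatorial feature after `days` days."""
    return (synodic_rotation_rate(0.0) - synodic_rotation_rate(latitude_deg)) * days
```

Under this assumed profile, a feature at 30 degrees latitude lags an equatorial one by about 2 degrees of longitude after three days, and the lag keeps growing over a full rotation.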

4.3. Subsequent steps toward a medium-term forecast 

In addition to short-term forecasts, predictions of the solar irradiance on time scales of months 
are needed for space weather applications, such as the evaluation of the upper atmosphere and 
thermosphere conditions. 

In principle, the artificial neural network described in Sect. 3 can be adapted to forecast
the solar activity on time scales of months. The adaptation would involve increasing
the complexity of the model by increasing the number of neurons and/or layers. One alternative is
to predict the distribution of the magnetic structures with a solar surface magnetic flux
transport model (Jiang et al. 2010) and then apply the algorithm described in Sect. 3.



In a magnetic flux transport model, the variation of the radial component of the surface
magnetic flux can be computed by
Fig. 6. (a) Evolution of the filling factor of the several components. The evolution of large active
regions is presented in blue, while the evolution of regions with filling factors between 24.8 and
89.1 ppm is shown in green. The red and yellow lines display the evolution of regions with filling
factors lower than 24.8 ppm. The evolution of the penumbrae and umbrae is presented in brown
and black, respectively. (b) Heliographic longitude of the disk center during the peaks of filling
factors of large magnetic active regions.



where $S(\theta,\phi,t,\mathrm{Class})$ is the source term of the different classes and
$D_r(\eta_r B_r)$ is the decay term parameterizing the radial diffusion of the magnetic field;
$\eta_h$ and $\eta_r$ are the horizontal and radial diffusivities, respectively; $v(\theta)$ is the
meridional flow; and $\Omega(\theta)$ is the latitudinal differential rotation.
The starting point is a set of synoptic-like charts of the distribution of the surface magnetic field
computed from the disk magnetograms measured during one solar rotation. Figure 16 shows an
example of the distribution of magnetic structures for one solar rotation. The visible disk on 2011-11-27
at 33:30 is presented in panel (b). The source term $S(\theta,\phi,t,\mathrm{Class})$ is determined from the
statistical properties of the classes for the previous rotations at a given heliographic sector.
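
The transport equation itself is not legible in this text. For orientation only, the form commonly used in the surface flux transport literature, written with the terms named in this subsection, is sketched below. The solar radius $R_\odot$, the colatitude convention for $\theta$, and the exact arrangement of the terms are assumptions and may differ from the authors' equation.

```latex
\frac{\partial B_r}{\partial t} =
  - \Omega(\theta)\,\frac{\partial B_r}{\partial \phi}
  - \frac{1}{R_\odot \sin\theta}\,
    \frac{\partial}{\partial \theta}\bigl[ v(\theta)\, B_r \sin\theta \bigr]
  + \frac{\eta_h}{R_\odot^{2}}
    \left[ \frac{1}{\sin\theta}\,\frac{\partial}{\partial \theta}
           \left( \sin\theta\, \frac{\partial B_r}{\partial \theta} \right)
         + \frac{1}{\sin^{2}\theta}\,
           \frac{\partial^{2} B_r}{\partial \phi^{2}} \right]
  + D_r(\eta_r B_r)
  + S(\theta, \phi, t, \mathrm{Class})
```

The first two terms describe advection by differential rotation and by the meridional flow, the bracketed term is the horizontal diffusion, and the last two terms are the radial decay and the source of new flux.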



5. Concluding remarks 

In the present work, we have developed an artificial neural network model to predict the short-term 
evolution of the solar irradiance based on near-real time observations of the solar surface magnetic 
field. We have shown that the total and spectral solar irradiance can be predicted up to three days 
with high accuracy by considering the evolution and distribution of magnetic structures on the solar
disk. 

In order to reduce the dimensionality of the problem, we have employed a feature extraction 
algorithm to identify and classify the magnetic structures observed on the solar disk. The classifica- 
tion scheme is based on the empirical distribution function of the fraction of the solar disk covered 
by individual structures. In this way, we have considered separately the evolution of large active 
regions and small short-lived ephemeral regions. Additionally, we have considered the evolution of 
sunspots (umbrae and penumbrae). 
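
The classification step can be sketched as a simple binning of structures by the fraction of the disk they cover. The thresholds below (16.7, 24.8 and 89.1 ppm) are the values quoted in the figure captions; their mapping onto classes I to IV, and the function names, are assumptions made here for illustration.

```python
import numpy as np

# Filling-factor thresholds in ppm of the solar disk, as quoted in the captions.
THRESHOLDS_PPM = np.array([16.7, 24.8, 89.1])
# Assumed class labels, from the smallest structures (I) to large active regions (IV).
CLASS_LABELS = ("I", "II", "III", "IV")

def classify_structures(areas_ppm):
    """Assign each structure to a class according to its disk coverage."""
    idx = np.digitize(np.asarray(areas_ppm, dtype=float), THRESHOLDS_PPM)
    return [CLASS_LABELS[i] for i in idx]

def class_filling_factors(areas_ppm):
    """Total filling factor (ppm) contributed by each class."""
    areas = np.asarray(areas_ppm, dtype=float)
    idx = np.digitize(areas, THRESHOLDS_PPM)
    return {label: float(areas[idx == k].sum())
            for k, label in enumerate(CLASS_LABELS)}
```

Reducing each magnetogram to a handful of per-class filling factors in this way is what keeps the dimensionality of the network input small.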

The coefficients of the neural network are constrained by comparing the output of the model
with measurements of the solar irradiance made by instruments onboard the SORCE spacecraft. The
generalization of the network is tested by dividing the data sets into two groups: (1) the training set;
and (2) the validation set. We have found that the model error is wavelength dependent. While
the model error for the 24-hour forecast in the band from 115 to 180 nm is lower than 5%, the model
error can reach 20% in the band from 180 to 310 nm. We speculate that the difference between the
performance of the network for these two bands may be due to the degradation and reduced
accuracy of the MUV measurements.

We have also tested the performance of the network for different forecast periods. As expected,
the performance of the network degrades progressively as the forecast period increases,
which significantly limits the maximum forecast period that we can achieve with the discussed
architecture. In Sect. 4.3 we briefly discuss how these limitations can be overcome by employing a
solar surface magnetic flux transport model.

The real-time short-term forecast of the total and spectral solar irradiance is available at
http://www.lpc2e.cnrs-orleans.fr/~soteria. The extension of the model to wavelengths above 310
nm will be available by the end of 2011.
Fig. 7. Evolution of the filling factors of the four inner rings. Panels (a) and (b) present the
evolution of the structures with filling factors lower than 16.7 ppm and between 16.7 and 24.8 ppm,
respectively. The evolution of large structures is presented in panels (c) and (d), while the evolution
of penumbrae and umbrae is presented in panels (e) and (f).
Fig. 8. Evolution of the filling factors of the four inner rings. Panels (a) and (b) present the
evolution of the structures with filling factors lower than 16.7 ppm and between 16.7 and 24.8 ppm,
respectively. The evolution of large structures is presented in panels (c) and (d), while the evolution
of penumbrae and umbrae is presented in panels (e) and (f).
Wavelength = 121.5 nm (Forecast: 24 Hours)
Fig. 9. Example of a 24-hour forecast training section of the neural network model for the
wavelength band centered at 121.5 nm (Lyman alpha). Panel (a) shows the evolution of the
contributions from classes II, III, and IV. Panel (b) presents the evolution of umbrae and penumbrae.
Panel (c) shows the evolution of the feedback of the first layer to the input of this layer.
Panel (d) shows a comparison between the output of the model (red curve) and the observations.
For reference, the output of the same model, but with a linear transfer function in the hidden layer,
is also presented. The training and validation sets are marked in panel (d).
Fig. 12. Example of a 12-hour forecast training section of the neural network model for the total
solar irradiance. The structure of the figure is the same as that of Fig. 9.
Fig. 13. Example of a 24-hour forecast training section of the neural network model for the total
solar irradiance. The structure of the figure is the same as that of Fig. 9.






Vieira et al.: Forecast of the Solar Irradiance 



Figure 14 panel labels: title "Total Solar Irradiance (Delay: 48 Hours)"; legends "Umbrae",
"Penumbrae", "Feedback", "Training Set"; horizontal axis "Time", from 08/11/10 to 24/10/11.



Fig. 14. Example of a 48-hour forecast training section of the neural network model for the total
solar irradiance. The structure of the figure is the same as that of Fig. 9.
Fig. 15. Example of a 72-hour forecast training section of the neural network model for the total
solar irradiance. The structure of the figure is the same as that of Fig. 9.
Fig. 16. Charts of the distribution of the surface magnetic field computed from the disk
magnetograms measured during one solar rotation. The visible disk on 2011-11-14 at 10:30 is presented
in panel (b). The color scheme follows the one for Fig. 4.